Molecular Biology and Evolution — Latest Matching Preprints

1

The diverging evolutionary history of opsin genes in Diptera

Feuda, R.; Goulty, M.; Zadra, N.; Gasparetti, T.; Rosato, E.; Segata, N.; Rizzoli, A.; Pisani, D.; Ometto, L.; Rota-Stabelli, O.

2020-06-29 evolutionary biology 10.1101/2020.06.29.177931 medRxiv

Top 0.1%

70.3%

Opsin receptors mediate the visual process in animals, and their evolutionary history can provide valuable hints on the ecological factors that underpin their diversification. Here we mined the genomes of more than 60 dipteran species and reconstructed the evolution of their opsin genes in a phylogenetic framework. Our phylogenies indicate that dipterans possess an ancestral set of five core opsins which have undergone several lineage-specific events, including an independent expansion of low wavelength opsins in flies and mosquitoes and numerous family-specific duplications and losses. Molecular evolutionary studies indicate that gene turnover rate, overall mutation rate, and site-specific selective pressure are higher in Anopheles than in Drosophila; we found signs of positive selection in both lineages, including events possibly associated with their peculiar behaviour. Our findings indicate an extremely variable pattern of opsin evolution in dipterans, showcasing how two similarly aged radiations - Anopheles and Drosophila - can be characterized by contrasting dynamics in the evolution of this gene family.

Competing Interest Statement: The authors have declared no competing interest.

2

A novel phylogenetic analysis combined with a machine learning approach predicts human mitochondrial variant pathogenicity

Akpinar, B. A.; Carlson, P. O.; Dunn, C. D.

2020-01-11 evolutionary biology 10.1101/2020.01.10.902239 medRxiv

Top 0.1%

66.7%

Linking mitochondrial DNA (mtDNA) variation to clinical outcomes remains a formidable challenge. Diagnosis of mitochondrial disease is hampered by the multicopy nature and potential heteroplasmy of the mitochondrial genome, differential distribution of mutant mtDNAs among various tissues, genetic interactions among alleles, and environmental effects. Here, we describe a new approach to the assessment of which mtDNA variants may be pathogenic. Our method takes advantage of site-specific conservation and variant acceptability metrics that minimize previous classification limitations. Using our novel features, we deploy machine learning to predict the pathogenicity of thousands of human mtDNA variants. Our work demonstrates that a substantial fraction of mtDNA changes not yet characterized as harmful are, in fact, likely to be deleterious. Our findings will be of direct relevance to those at risk of mitochondria-associated metabolic disease.

3

Multiple ancestral duplications of the red-sensitive opsin gene (LWS) in teleost fishes and convergent spectral shifts to green vision in gobies

Cortesi, F.; Escobar Camacho, D.; Luehrmann, M.; Sommer, G. M.; Musilova, Z.

2021-05-09 evolutionary biology 10.1101/2021.05.08.443214 medRxiv

Top 0.1%

65.3%

Photopigments, formed by an opsin protein bound to a light-sensitive chromophore, underlie vertebrate vision. Long-wavelength-sensitive (LWS) opsins mediate red-light detection, and most teleosts retain a single functional LWS1. A shorter-shifted green-sensitive paralog (LWS2) is found only in a few lineages. By mining teleost genomes and sequencing retinal transcriptomes, we identify elopomorphs as an additional lineage retaining LWS2 (alongside characins and osteoglossomorphs), and we reveal a previously overlooked shorter-shifted paralog, LWS3, restricted to gobies (Percomorpha) and arising from an ancient duplication. Structural modeling of twelve LWS opsins reveals convergent evolution at four key amino acid sites (214, 259, 261, 269) in the retinal-binding pocket, indicating convergent substitutions in human MWS, teleost LWS2, and goby LWS3 relative to red-sensitive counterparts, consistent with repeated spectral shifts toward green wavelengths. In several lineages--including characins, mormyrids, gobies, and primates--these shorter-shifted LWS opsins have functionally replaced the canonical green-sensitive RH2 opsin. Retinal transcriptomes and in situ hybridization demonstrate variable lws3 expression across gobies, with localization to a distinct double-cone member in Amblygobius phalaena, analogous to rh2 expression in other fish species. Together, these results show that repeated convergent evolution toward green sensitivity over 500 million years involves coordinated changes at the molecular, regulatory, and functional levels, providing a striking example of multilevel sensory adaptation.

Significance statement: Vertebrate vision depends on opsins, which detect specific wavelengths of light. Most teleosts retain a single long-wavelength-sensitive opsin (LWS1), while a green-sensitive paralog (LWS2) occurs only in a few lineages. We identify a previously overlooked green-shifted opsin, LWS3, which is restricted to gobies, and show that LWS2 and LWS3 have repeatedly evolved green sensitivity through convergent amino acid changes. In parallel to the primate MWS/LWS evolution, these shorter-shifted opsins have functionally replaced the canonical green-sensitive RH2 opsin in multiple teleost lineages. By integrating genomic, structural, and expression data, we reveal how multilevel convergent evolution--from molecular tuning to photoreceptor specialization--has repeatedly shaped green-light vision over the past 500 million years of evolution, illustrating the remarkable flexibility of vertebrate visual systems.

4

Genome structural variants shape adaptive success of an invasive urban malaria vector Anopheles stephensi

Samano, A.; Chakraborty, M.; Liao, Y.; Ishtiaq, F.; Kumar, N.

2024-07-30 evolutionary biology 10.1101/2024.07.29.605641 medRxiv

Top 0.1%

64.8%

Global changes are associated with the emergence of several invasive species. However, the genomic determinants of the adaptive success of an invasive species in a new environment remain poorly understood. Genomic structural variants (SVs), consisting of copy number variants, play an important role in adaptation. SVs often cause large adaptive shifts in ecologically important traits, which makes SVs compelling candidates for driving rapid adaptations to environmental changes, which is critical to invasive success. To address this problem, we investigated the role SVs play in the adaptive success of Anopheles stephensi, a primary vector of urban malaria in South Asia and an invasive malaria vector in several South Asian islands and Africa. We collected whole genome sequencing data from 115 mosquitoes from invasive island populations and four locations from mainland India, an ancestral range for the species. We identified 2,988 duplication copy number variants and 16,038 deletions in these strains, with ~50% overlapping genes. SVs are enriched in genomic regions with signatures of selective sweeps in the mainland and invasive island populations, implying a putative adaptive role of SVs. Nearly all high-frequency SVs, including the candidate adaptive variants, in the invasive island populations are present on the mainland, suggesting a major contribution of existing variation to the success of the island populations. Among the candidate adaptive SVs, three duplications involving toxin-resistance genes evolved, likely due to the widespread application of insecticides in India since the 1950s. We also identify two SVs associated with the adaptation of An. stephensi larvae to brackish water in the island and two coastal mainland populations, where the mutations likely originated. Our results suggest that existing SVs play a vital role in the evolutionary success of An. stephensi in new environmental conditions.

5

Pairs of compensatory frameshifting mutations contribute to evolution of protein-coding sequences in vertebrates and insects

Biba, D.; Klink, G. V.; Bazykin, G. A.

2020-12-26 evolutionary biology 10.1101/2020.12.25.424394 medRxiv

Top 0.1%

63.2%

Insertions and deletions of lengths not divisible by 3 in protein-coding sequences cause frameshifts that usually induce premature stop codons and may carry a high fitness cost. However, this cost can be partially offset by a second compensatory indel restoring the reading frame. The role of such pairs of compensatory frameshifting mutations (pCFMs) in evolution has not been studied systematically. Here, we use whole-genome alignments of protein coding genes of 100 vertebrate species, and of 122 insect species, studying the prevalence of pCFMs in their divergence. We detect a total of 619 candidate pCFM-genes; 11 of them pass stringent quality filtering, including three human genes: RAB36, ARHGAP6 and NCR3LG1. In some instances, amino acid substitutions closely predating or following pCFMs restored the biochemical similarity of the frameshifted segment to the ancestral amino acid sequence, possibly reducing or negating the fitness cost of the pCFM. Typically, however, the resulting sequence bore no biochemical similarity to the ancestral one, indicating that pCFMs can uncover radically novel regions of protein space. In total, pCFMs represent an appreciable and previously overlooked source of novel variation in amino acid sequences.
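The reading-frame arithmetic behind this abstract can be illustrated with a minimal sketch. It is not the authors' detection pipeline; the function names and the example indel lengths are hypothetical, and it only checks the modulo-3 condition under which a second indel restores the frame broken by the first.

```python
def shifts_frame(indel_length: int) -> bool:
    """True if a single indel breaks the reading frame.

    Positive lengths are insertions, negative lengths are deletions.
    """
    return indel_length % 3 != 0


def is_compensatory_pair(first: int, second: int) -> bool:
    """True if each indel alone shifts the frame but the pair restores it."""
    return (
        shifts_frame(first)
        and shifts_frame(second)
        and (first + second) % 3 == 0
    )


if __name__ == "__main__":
    # Hypothetical indel lengths, chosen only for illustration.
    for pair in [(+1, -1), (+1, +2), (+1, +1), (+3, -3)]:
        print(pair, is_compensatory_pair(*pair))
```

Under this simplified definition, a 1-bp insertion followed by a 1-bp deletion counts as a compensatory pair, while two in-frame indels (such as +3 and -3) do not, because neither shifts the frame on its own.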

6

Deep learning analyses of DNA sequences resolve the retention of the Duffy-null resistance to Plasmodium vivax malaria in Africa

Laval, G.; Decugis, A.; Parasayan, O.; Patin, E.; Quintana-Murci, L.; Chiaroni, J.

2025-12-26 evolutionary biology 10.64898/2025.12.25.695976 medRxiv

Top 0.1%

60.8%

P. vivax, the most geographically widespread human malaria parasite, with millions of clinical cases per year, is nevertheless nearly absent from sub-Saharan Africa. Positive selection targeting the rs2814778 protective mutation, also known as the Duffy-null allele, may explain the absence (or near absence) of vivax in sub-Saharan Africa by a progressive purge of the pathogen due to a quasi-fixation of the Duffy-null allele and the resulting high rates of protected carriers in western, central and eastern populations. Yet, while positive selection has been clearly evidenced in admixed populations coexisting with vivax, the currently accepted selection model poorly explains the lack of the Duffy-null allele in Europe, or in Asia where the pathogen is mainly observed. In this article, several validated Deep Learning methods applied to high coverage sequence data obtained in 589 African individuals resolved this retention of the Duffy-null resistance to vivax in Africa. The CNN and GAN algorithms implemented in this study also predict a rise in frequency of the Duffy-null mutation due to selection 25-35 kya in the western part of Africa, a geographical region and a time frame overlapping with the rise of another protective mutation, βS, the sickle-cell mutation protective at heterozygous state against the malaria caused by P. falciparum. In addition, the pattern of Duffy-null haplotypes highlights a quick spread of the Duffy-null allele in sub-Saharan Africa due to post-admixture selection events following the road of the recent Bantu expansion. Independent lines of evidence describing malaria as a life-threatening disease in West Africa from ~30 kya, together with a rise in frequency followed by recent disseminations of the Duffy-null resistance, open new perspectives about both the history of malaria as a major human disease and the history of the main protective mutations in Africa.

7

Coevolution of Drosophila-type Timeless with Partner Clock Proteins

Bullo, E.; Chen, P.; Fiala, I.; Smykal, V.; Dolezel, D.

2024-12-25 evolutionary biology 10.1101/2024.12.25.628932 medRxiv

Top 0.1%

60.2%

Drosophila-type timeless (dTIM) is an established key clock protein in fruit flies, regulating rhythmicity and light-mediated entrainment. However, as indicated by functional experiments, its contribution to the clock differs in various insects. Therefore, we conducted a comprehensive phylogenetic analysis of dTIM across animals, dated its origin, gene duplications, and losses. We identified variable and conserved protein domains, and pinpointed animal lineages that underwent the biggest changes in the dTIM sequence. While dTIM modifications are only mildly affected by changes in the PER protein, even the complete loss of PER in echinoderms had no impact on dTIM. However, changes in dTIM always co-occur with the loss of CRYPTOCHROMES or JETLAG. This is exemplified by the remarkably accelerated evolution of dTIM in phylloxera and aphids. Finally, alternative d-tim splicing, characteristic of D. melanogaster temperature-dependent function, is conserved at least to some extent in Diptera, albeit with unique alterations. Altogether, this study pinpoints major changes that shaped dTIM origin and evolution.

8

Evolutionary dynamics of insect odorant receptors reveal ecological tuning shaping olfactory perception

Zhang, T.; Yang, X.; Fu, Y.; Xue, W.; Zhang, Y.; Duan, S.; Yin, Y.; Guo, Y.; Gao, C.; Liu, Y.; Li, G.; Xu, C.; Lu, H.

2026-02-13 evolutionary biology 10.64898/2026.02.12.705626 medRxiv

Top 0.1%

60.1%

Insect olfaction is facilitated by a heterotetrameric odorant receptor-odorant receptor co-receptor (OR-Orco) complex, which is distinct from that of vertebrate ORs. However, extreme sequence divergence among insect ORs has hindered a unified understanding of their evolutionary history and ecological importance. In this study, we present a multiscale analysis of OR genes across 115 insect species. We overcome the limitations of traditional phylogenetic approaches by applying a protein similarity network-based strategy and introduce a "trunk-branch" framework to systematically describe the evolutionary trajectories of insect ORs across sequence, structural, and functional levels. Although they possess different sequences and structural communities, all the insect orders were found to contain fully functional OR repertoires. Notably, insects adapted to end-Permian mass extinction through shifts in their functional OR repertoires, and early- and late-diverging lineages exhibit distinct patterns of OR differentiation. The emergence of Orco represents a key evolutionary transition point, marking the shift from a homomeric to a heteromeric complex accompanied by specialization of the extracellular domain and binding pocket. Furthermore, we established robust associations between olfactory recognition breadth and ecological variables, including diet, circadian rhythm, and habitat. Our findings provide a comprehensive framework for the evolution of insect ORs, explaining the complex adaptive relationship between insect olfactory potential and diverse ecological environments.

9

A thermodynamic model of protein structure evolution explains empirical amino acid rate matrices

Norn, C.; Andre, I.; Theobald, D. L.

2020-12-03 evolutionary biology 10.1101/2020.12.02.408807 medRxiv

Top 0.1%

59.8%

Proteins evolve under a myriad of biophysical selection pressures that collectively control the patterns of amino acid substitutions. Averaged over time and across proteins, these evolutionary pressures are sufficiently consistent to produce global substitution patterns that can be used to successfully find homologues, infer phylogenies, and reconstruct ancestral sequences. Although the factors which govern the variation of protein substitution rates have received much attention, the influence of thermodynamic stability constraints remains unresolved. Here we develop a simple model to calculate amino acid rate matrices from evolutionary dynamics controlled by a fitness function that reports on the thermodynamic effects of amino acid mutations in protein structures. This hybrid biophysical and evolutionary model accounts for nucleotide transition/transversion rate bias, multi-nucleotide codon changes, the number of codons per amino acid, and thermodynamic protein stability. We find that our theoretical model accurately recapitulates the complex pattern of empirical rates observed in common global amino acid substitution matrices used in phylogenetics. These results suggest that selection for thermodynamically stable proteins, coupled with nucleotide mutation bias filtered by the structure of the genetic code, is the primary global driver behind the amino acid substitution patterns observed in proteins throughout the tree of life.

10

Standing genetic variation underlies divergence of nicotinic acetylcholine receptor subunits among cryptic species of the Anopheles gambiae complex

Fouet, C.; Rios, D.; Ashu, F.; Pinch, M.; Hernandez, C.; Ambadiang, M.; Kamdem, C.

2025-12-09 evolutionary biology 10.64898/2025.12.07.692827 medRxiv

Top 0.1%

59.0%

Arthropod species differ in insecticide susceptibility, yet how pre-existing polymorphism at target sites shapes variable responses within and among populations remains poorly understood. Recently diverged taxa provide ideal systems to test how target-site divergence modulates species sensitivity. Using whole-genome sequencing data from 573 mosquitoes representing six cryptic species of the Anopheles gambiae complex, we analyzed standing genetic variation across all 11 nicotinic acetylcholine receptor (nAChR) subunit genes to establish a baseline for natural diversity before the large-scale deployment of nAChR-targeting insecticides in Africa. We detected no previously reported resistance alleles from agricultural pests and found no evidence of selective sweeps or loss-of-function mutations across the nAChR gene family. Patterns of polymorphism were consistent with strong purifying selection. Most nonsynonymous variants were rare, predicted to be tolerated by SIFT (score ≥ 0.05), present almost exclusively in heterozygotes, and occurred outside ligand-binding and transmembrane domains. However, the α6 subunit exhibited relaxed constraint, with two high-frequency substitutions (I198M and D202E) that defined haplotypes segregating by species. The derived alleles represented ancient polymorphisms, showed evidence of introgression, and were fixed in populations with reduced larval susceptibility to spinosad. Our findings show that modest standing variation can shape divergence at insecticide target sites within a highly constrained gene family and underscore the need to monitor interspecific variation during the deployment of nAChR-targeting insecticides.

11

Purifying selection and adaptive evolution proximate to the zoonosis of SARS-CoV-1 and SARS-CoV-2

Townsend, J. P.; Gaughran, S.; Hassler, H. B.; Fisk, J. N.; Nagib, M.; Wu, Y.; Wang, Y.; Wang, Z.; Galvani, A. P.; Dornburg, A.

2023-08-07 evolutionary biology 10.1101/2023.08.07.552269 medRxiv

Top 0.1%

58.8%

Over the past two decades the pace of spillovers from animal viruses to humans has accelerated, with COVID-19 becoming the most deadly zoonotic disease in living memory. Prior to zoonosis, it is conceivable that the virus might largely be subjected to purifying selection, requiring no additional selective changes for successful zoonotic transmission. Alternatively, selective changes occurring in the reservoir species may coincidentally preadapt the virus for human-to-human transmission, facilitating spread upon cross-species exposure. Here we quantify changes in the genomes of SARS-CoV-2 and SARS-CoV-1 proximate to zoonosis to evaluate the selection pressures acting on the viruses. Application of molecular-evolutionary and population-genetic approaches to quantify site-specific selection within both SARS-CoV genomes revealed strong purifying selection across many genes at the time of zoonosis. Even in the viral surface-protein Spike that has been fast-evolving in humans, there is little evidence of positive selection proximate to zoonosis. Nevertheless, in SARS-CoV-2, NSP12, a core protein for viral replication, exhibited a region under adaptive selection proximate to zoonosis. Furthermore, in both SARS-CoV-1 and SARS-CoV-2, regions of adaptive selection proximate to zoonosis were found in ORF7a, a putative Major Histocompatibility Complex modulatory gene. These findings suggest that these replication and immunomodulatory proteins have played a previously underappreciated role in the adaptation of SARS coronaviruses to human hosts.

12

Inferring the landscapes of mutation and recombination in the common marmoset (Callithrix jacchus) in the presence of twinning and hematopoietic chimerism

Soni, V.; Versoza, C. J.; Jensen, J. D.; Pfeifer, S. P.

2025-07-04 evolutionary biology 10.1101/2025.07.01.662565 medRxiv

Top 0.1%

58.3%

The common marmoset (Callithrix jacchus) is an important model in biomedical and clinical research, particularly for the study of age-related, neurodegenerative, and neurodevelopmental disorders (due to their biological similarities with humans), infectious disease (due to their susceptibility to a variety of pathogens), as well as developmental biology (due to their short gestation period relative to many other primates). Yet, while being one of the most commonly used non-human primate models for research, the population genomics of the common marmoset remains relatively poorly characterized, despite the critical importance of this knowledge in many areas of research including genome-wide association studies, models of polygenic risk scores, and scans for the targets of selection. This neglect owes, at least in part, to two biological peculiarities related to the reproductive mode of the species -- frequent twinning and sibling chimerism -- which are likely to affect standard population genetic approaches relying on assumptions underlying the Wright-Fisher model. Using high-quality population genomic data, we here infer the rates and landscapes of mutation and recombination -- two fundamental processes dictating the levels and patterns of genetic variability -- in the presence of these biological features, and discuss our findings in light of recent work in primates. Our results suggest that, while the species exhibits relatively low neutral mutation rates, rates of recombination are in the range of those observed in other anthropoids. Moreover, the recombination landscape of common marmosets, like that of many vertebrates, is dominated by PRDM9-mediated hotspots, with artificial intelligence-based models predicting an intricate 3D-structure of the species-specific PRDM9-DNA binding complex in silico. 
Apart from providing novel insights into the population genetics of common marmosets, given the importance of the availability of fine-scale maps of mutation and recombination for evolutionary inference, this work will also serve as a valuable resource to aid future genomic research in this widely studied system.

13

Survivor bias drives overestimation of stability in reconstructed ancestral proteins

Thomas, A.; Evans, B. D.; van der Giezen, M.; Harmer, N. J.

2022-11-25 evolutionary biology 10.1101/2022.11.23.517659 medRxiv

Top 0.1%

55.5%

Ancestral sequence reconstruction has been broadly employed over the past two decades to probe the evolutionary history of life. Many ancestral sequences are thermostable, supporting the "hot-start" hypothesis for life's origin. Recent studies have observed thermostable ancient proteins that evolved in moderate temperatures. These effects were ascribed to "consensus bias". Here, we propose that "survivor bias" provides a complementary rationalisation for ancestral protein stability in alignment-based methods. As thermodynamically unstable proteins will be selected against, ancestral or consensus sequences derived from extant sequences are selected from a dataset biased towards the more stabilising amino acids in each position. We thoroughly explore the presence of survivor bias using a highly parameterizable in silico model of protein evolution that tracks stability at the population, protein, and amino acid levels. We show that ancestors and consensus sequences derived from populations evolved under selective pressure for stability throughout their history are significantly biased toward thermostability. Our work proposes a complementary explanation of the origin of thermostability in the burgeoning engineering tools of ancestral sequence reconstruction and consensuses. It provides guidance for the thorough derivation of conclusions from future ancestral sequence reconstruction work.

14

A Pre-Bilaterian Origin of Phototransduction Genes and Photoreceptor Cells

Aleotti, A.; Giorgini, F.; Feuda, R.

2025-02-10 evolutionary biology 10.1101/2025.02.09.637293 medRxiv

Top 0.1%

55.4%

The evolution of vision is a major novelty in animals, playing a fundamental role in developing complex behaviours. Vision initiates with a light-triggered phototransduction cascade occurring in photoreceptor cells (PRCs). The two main PRC types, ciliary and rhabdomeric, employ both specific and common genes for phototransduction. Despite being crucial for vision, the origin and evolution of photoreceptor cells and phototransduction pathways remain unclear. Using phylogenetic methods, we studied the evolution of all phototransduction genes, elucidating their gene duplication patterns in over 80 species, including non-bilaterian metazoans and other eukaryotes. Then, we investigated the expression of phototransduction genes in available single-cell RNA-sequencing data from various animals, including non-bilaterians. By using phototransduction genes as markers, we identified putative photoreceptor-like cells across animals and compared their regulatory toolkits. Gene families encoding phototransduction components are generally ancient, predating the origin of vision. However, many phototransduction genes originated in the metazoan stem group. Moreover, putative photoreceptor cells identified in non-bilaterians appeared to express some but not all components of the two well-characterised phototransduction pathways, suggesting potential lineage-specific components involved in phototransduction. Finally, we identified conserved expression of certain transcription factors in putative PRCs in non-bilaterians, suggesting the homology of PRCs.

15

Reshaping Organellar Translation and tRNA Metabolism: The Consequences of Photosynthesis Loss and Massive Horizontal Gene Transfer

Ceriotti, L. F.; Gatica-Soria, L. M.; Prasad, K.; DeTar, R. A.; Warren, J. M.; Eichler, E.; Chustecki, J. M.; Elowsky, C.; Christensen, A. C.; Zhou, R.; Sloan, D. B.; Sanchez-Puerta, M. V. B.

2026-01-11 evolutionary biology 10.64898/2026.01.09.698701 medRxiv

Top 0.1%

55.4%

The transition to holoparasitism in plants precipitates the loss of photosynthesis, fundamentally altering the selective landscape acting on organellar genomes. These changes raise questions about the mechanisms by which the essential, coevolved machinery of translation responds to extreme genomic erosion and metabolic dependency. Integrating comparative genomics, tRNA sequencing, and subcellular localization assays, we elucidate the extensive rewiring of organellar translation systems and the tRNA-dependent tetrapyrrole biosynthesis pathway in the holoparasitic angiosperm family Balanophoraceae, which exhibits extreme reduction of tRNA content in plastid and mitochondrial genomes. We identified a rare evolutionary event: the putative intracellular transfer of the plastid initiator tRNA (tRNA-iMet) to the nucleus, which compensates for its loss from the plastid genome. We also demonstrate that the unusual UAG-to-Trp reassignment in the Balanophora plastid genetic code is driven by the loss of release factor pRF1 and the recruitment of a mutated nuclear tRNA-Trp. Furthermore, we reveal that the retention of organellar nuclear-encoded aminoacyl-tRNA synthetases is dictated by the presence/absence of cognate organellar tRNAs, which appear to be functional regardless of their foreign (horizontal transfer from the host plant) or native origins. Finally, we uncover a striking evolutionary asymmetry in nuclear-encoded ribosomal proteins: while plastid subunits exhibit elevated substitution rates consistent with relaxed selection and compensatory coevolution, mitochondrial subunits display high sequence conservation, likely maintaining compatibility with the extensive horizontal gene transfer observed in this lineage. Collectively, these findings represent some of the most extreme changes ever identified in the anciently conserved machinery of plant organellar translation.

16

Genomic Signatures of Microgeographic Adaptation in Anopheles coluzzii Along an Anthropogenic Gradient in Gabon

Daron, J.; Bouafou, L.; Tennessen, J.; Rahola, N.; Makanga, B.; Akone-Ella, O.; Ngangue, M.; Longo Pendy, N.; Paupy, C.; Neafsey, D. E.; Fontaine, M. C.; Ayala, D.

2024-05-19 evolutionary biology 10.1101/2024.05.16.594472 medRxiv

Top 0.1%

54.8%

Species distributed across heterogeneous environments often evolve locally adapted populations, but understanding how these persist in the presence of homogenizing gene flow remains puzzling. In Gabon, Anopheles coluzzii, a major African malaria mosquito, is found along an ecological gradient, including a sylvatic population, away from any human presence. This study investigates the genomic signatures of local adaptation in populations from distinct environments, including the urban area of Libreville and two proximate sites 10 km apart in the La Lope National Park (LLP), a village and its sylvatic neighborhood. Whole genome re-sequencing of 96 mosquitoes unveiled ~5.7 million high-quality single nucleotide polymorphisms. Coalescent-based demographic analyses suggest an ~8,000-year-old divergence between Libreville and La Lope populations, followed by a secondary contact (~4,000 ybp) resulting in asymmetric effective gene flow. The urban population displayed reduced effective size, evidence of inbreeding, and strong selection pressures for adaptation to urban settings, as suggested by the hard selective sweeps associated with genes involved in detoxification and insecticide resistance. In contrast, the two geographically proximate LLP populations showed larger effective sizes, and distinctive genomic differences in selective signals, notably soft selective sweeps on the standing genetic variation. Although neutral loci and chromosomal inversions failed to discriminate between LLP populations, our findings support that microgeographic adaptation can swiftly emerge through selection on standing genetic variation despite high gene flow. This study contributes to the growing understanding of evolution of populations in heterogeneous environments amid ongoing gene flow and how major malaria mosquitoes adapt to humans.

Significance: Anopheles coluzzii, a major African malaria vector, thrives from humid rainforests to dry savannahs and coastal areas. This ecological success is linked to its close association with domestic settings, with humans playing significant roles in driving the recent urban evolution of this mosquito. Our research explores the assumption that these mosquitoes are strictly dependent on human habitats, by conducting whole-genome sequencing on An. coluzzii specimens from urban, rural, and sylvatic sites in Gabon. We found that urban mosquitoes show de novo genetic signatures of human-driven vector control, while rural and sylvatic mosquitoes exhibit distinctive genetic evidence of local adaptations derived from standing genetic variation. Understanding adaptation mechanisms of this mosquito is therefore crucial to predict evolution of vector control strategies.

17

Ghost admixture in eastern gorillas

Pawar, H.; Cuadros, S.; de Manuel, M.; van der Valk, T.; Lobon, I.; Alvarez-Estape, M.; Haber, M.; Dolgova, O.; Han, S.; Ayub, Q.; Bautista, R.; Kelley, J. L.; Cornejo, O. E.; Lao, O.; Andres, A. M.; Guschanski, K.; Ssebide, B.; Cranfield, M.; Tyler-Smith, C.; Xue, Y.; Prado-Martinez, J.; Marques-Bonet, T.; Kuhlwilm, M.

2022-12-19 evolutionary biology 10.1101/2022.12.19.521012 medRxiv

Top 0.1%

54.8%

Archaic admixture has had a significant impact on human evolution, with multiple events across different clades, including from extinct hominins such as Neanderthals and Denisovans into modern humans. Within the great apes, archaic admixture has been identified in chimpanzees and bonobos, but the possibility of such events has not been explored in other species. Here, we address this question using high-coverage whole genome sequences from all four extant gorilla subspecies, including six newly sequenced eastern gorillas from previously unsampled geographic regions. Using Approximate Bayesian Computation (ABC) with neural networks to model the demographic history of gorillas, we find a signature of admixture from an archaic ghost lineage into the common ancestor of eastern gorillas, but not western gorillas. We infer that up to 3% of the genome of these individuals is introgressed from an archaic lineage that diverged more than 3 million years ago from the common ancestor of all extant gorillas. This introgression event took place before the split of mountain and eastern lowland gorillas, likely more than 40 thousand years ago, and may have influenced perception of bitter taste in eastern gorillas. When comparing the introgression landscapes of gorillas, humans and bonobos, we find a consistent depletion of introgressed fragments on the X chromosome across these species. However, depletion in protein-coding content is not detectable in eastern gorillas, possibly as a consequence of stronger genetic drift in this species.

18

Insights into RAG evolution from the identification of "missing link" family A RAGL transposons

Martin, E. C.; Le Targa, L.; Tsakou-Ngouafo, L.; Fan, T.-P.; Lin, C.-Y.; Xiao, J.; Su, Y.-H.; Petrescu, A.-J.; Pontarotti, P.; Schatz, D. G.

2023-08-20 evolutionary biology 10.1101/2023.08.20.553239 medRxiv

Top 0.1%

53.1%

A series of "molecular domestication" events are thought to have converted an invertebrate RAG-like (RAGL) transposase into the RAG1-RAG2 (RAG) recombinase, a critical enzyme for adaptive immunity in jawed vertebrates. The timing and order of these events are not well understood, in part because of a dearth of information regarding the invertebrate RAGL-A transposon family. In contrast to the abundant and divergent RAGL-B transposon family, RAGL-A most closely resembles RAG and is represented by a single orphan RAG1-like (RAG1L) gene in the genome of the hemichordate Ptychodera flava (PflRAG1L-A). Here, we provide evidence for the existence of complete RAGL-A transposons in the genomes of P. flava and several echinoderms. The predicted RAG1L-A and RAG2L-A proteins encoded by these transposons intermingle sequence features of jawed vertebrate RAG and RAGL-B transposases, leading to a prediction of DNA binding, catalytic, and transposition activities that are a hybrid of RAG and RAGL-B. Similarly, the terminal inverted repeats (TIRs) of the RAGL-A transposons combine features of both RAGL-B transposon TIRs and RAG recombination signal sequences. Unlike all previously described RAG2L proteins, PflRAG2L-A and echinoderm RAG2L-A contain an acidic hinge region, which we demonstrate is capable of efficiently inhibiting RAG-mediated transposition. Our findings provide evidence for a critical intermediate in RAG evolution and argue that certain adaptations thought to be specific to jawed vertebrates (e.g., the RAG2 acidic hinge) actually arose in invertebrates, thereby focusing attention on other adaptations as the pivotal steps in the completion of RAG domestication in jawed vertebrates.

19

Recurrent Evolutionary Innovations in Rodent and Primate Schlafen Genes

Mordier, J.; Fraisse, M.; Cohen-Tannoudji, M.; Molaro, A.

2024-01-13 evolutionary biology 10.1101/2024.01.12.575368 medRxiv

Top 0.1%

53.0%

SCHLAFEN proteins are a large family of RNase-related enzymes carrying essential immune and developmental functions. Despite these important roles, Schlafen genes display varying degrees of evolutionary conservation in mammals. While this appears to influence their molecular activities, a detailed understanding of these evolutionary innovations is still lacking. Here, we used in-depth phylogenomic approaches to characterize the evolutionary trajectories and selective forces shaping mammalian Schlafen genes. We traced lineage-specific Schlafen amplifications and found that recent duplicates evolved under distinct selective forces, supporting repeated sub-functionalization cycles. Codon-level natural selection analyses in primates and rodents identified recurrent positive selection over Schlafen protein domains engaged in viral interactions. Combining crystal structures with machine learning predictions, we discovered a novel class of rapidly evolving residues enriched at the contact interface of SCHLAFEN protein dimers. Our results suggest that inter-Schlafen compatibilities are under strong selective pressures and are likely to impact their molecular functions. We posit that cycles of genetic conflicts with pathogens and between paralogs drove the recurrent evolutionary innovations of Schlafens in mammals.

20

Improved gene tree inference from removing alignment errors both from focal genes and when training substitution models

Wheeler, A. L.; Chatur, C.; Goodman, P. W.; Edgar, R. C.; Huttley, G. A.; Masel, J.

2025-12-02 evolutionary biology 10.64898/2025.12.01.691663 medRxiv

Top 0.1%

52.8%

Multiple Sequence Alignment (MSA) is a key step in phylogenetic analysis, and is prone to error. Unfortunately, algorithms that remove likely alignment errors from MSAs sometimes also remove informative residues, making phylogenetic tree inference worse. Here we present a novel MSA cleaning algorithm based on consensus between MSAs using a range of Hidden Markov Models and guide trees, named CLOAK (CLeaning On the basis of Alignment C(K)onsensus). CLOAK is a gentle filter, with a low false positive rate for removal from MSAs according to the BAliBASE benchmarks, while still removing a significant fraction of likely alignment errors. Gentle and stringent MSA filtering methods are appropriate for different tasks. We assess methods based on their ability to bring the gene trees of single-copy orthologs closer to the accepted species tree. Amino acid substitution models trained on filtered MSAs improve gene tree inference, with stricter filtering methods providing the biggest model improvements. In contrast, it is gentler filtering of single-gene MSAs that provides additional improvements to gene tree inference, with CLOAK performing best.